(54) Dynamic rate control scheduler for ATM networks



(57) A Dynamic Rate Control (DRC) scheduler for scheduling cells for service in a generic Asynchronous
Transfer Mode (ATM) switch is disclosed. According to the inventive DRC, each traffic stream associated with
an internal switch queue is rate-shaped according to a rate which consists of a minimum guaranteed rate and a
dynamic component computed based on congestion information within the switch. While achieving high
utilization, DRC guarantees a minimum throughput for each stream and fairly distributes unused bandwidth.
The distribution of unused bandwidth in DRC can be assigned flexibly, i.e., the unused bandwidth need not be
shared in proportion to the minimum throughput guarantees, as in weighted fair share schedulers. Moreover, an
effective closed-loop QoS control can be built into DRC by dynamically updating a set of weights based on
observed QoS. Another salient feature of DRC is its ability to control internal congestion at bottleneck
points within a multistage switch. DRC can also be extended beyond the local switch in a hop-by-hop fashion.

BACKGROUND OF THE INVENTION

1. Field of the Invention

[0001] The present invention relates to a control scheduler for an ATM network and, more specifically, to a scheduler
which guarantees a minimum rate of transmission while fairly distributing any unused bandwidth.

2. Description of Related Art 

[0002] High-speed networks based on the Asynchronous Transfer Mode (ATM) are expected to carry services with a
wide range of traffic characteristics and quality-of-service (QoS) requirements. For example, in audio transmission a
cell is useless to the receiver if it is delayed beyond the specified rate. On the other hand, video transmission is very
bursty and, unless shaped at the entry point, may cause temporary congestion and delay other cells. Integrating all
services in one network with a uniform transport mechanism can potentially simplify network operation and improve
network efficiency. In order to realize these potential benefits, an efficient and fair means of allocating the network
resources is essential.

[0003] A central problem in allocating the network resources is the manner in which the service to the various users
is prioritized. A simple model is to use a First-In First-Out (FIFO) algorithm. In a simple FIFO scheduler, however,
there is no way of guaranteeing that each stream gets its assigned rate. During some interval of time, a given
stream may transmit at a rate higher than its assigned rate M_i, and thereby steal bandwidth from other streams which
are transmitting at or below their assigned rates. This problem led to the development of various mechanisms for
shaping the entry to the network, such as the known leaky bucket algorithm. For example, the output stream for each queue
can be peak rate shaped to a predetermined rate M_i.

[0004] Figure 1 shows a static rate control (SRC) scheduler with N stream queues SQ1, SQ2, ..., SQN, one queue
corresponding to each stream. The SRC scheduler serves a queue i at the constant rate M_i and the output cell streams
are fed to a common bottleneck queue CQ which is served at a given rate C. Service from the common queue CQ
corresponds to cell transmission over a link of capacity C.

[0005] Rate-shaping transforms the streams into constant rate streams (assuming all queues are continuously
backlogged). Provided the relationship

\[ \sum_{i=1}^{N} M_i \le C \qquad (1) \]

holds (to be developed further below), the bottleneck queue will be stable; in fact, the maximum queue length is N.
Strict inequality in (1) will usually hold, implying that the cell delay in the common queue will be small with high
probability. Although the service discipline depicted in Figure 1 is work-conserving with respect to the stream queues, it is
non-work-conserving with respect to the common queue, since it is possible that the common queue may go empty
even when at least one of the stream queues is non-empty. This scheduler is similar to a circuit-switched system except
for the asynchronous nature of the cell streams.

[0006] If the rates M_i have been computed correctly based on the stream traffic characteristics and QoS
requirements, the minimum rate scheduler should succeed in guaranteeing QoS for all of the streams. However, because this
scheduler is non-work-conserving with respect to the common queue, bandwidth could be wasted for one of two
reasons:

• The CAC algorithm was optimistic in its computation of M_i. It may be the case that a bandwidth of M_i + Δ over short
intervals of time is required to ensure that QoS is met for stream i.

• The traffic stream could include low priority cells, with the cell loss priority (CLP) bit set to one.

In the first case, a stream should be allowed to make use of bandwidth beyond its allocated rate M_i, if the bandwidth is
available. In the second case, the QoS guarantee applies only to cells that conform to the negotiated traffic contract,
i.e., cells with cell loss priority (CLP) set to zero. However, if bandwidth is available, a stream should be permitted to
transmit nonconforming cells, i.e., cells tagged as CLP=1 cells, over and above the allocated minimum rate for CLP=0
cells. If bandwidth is not available, CLP=1 cells should be dropped before CLP=0 cells; i.e., there should be a lower
threshold for dropping CLP=1 cells. (As is known in the art, when a source transmits at a rate higher than the negotiated
rate, its violating cells are tagged by setting their CLP to 1.)

[0007] Clearly, the problem with the minimum rate scheduler is that streams cannot make use of excess bandwidth
even when it is available. In minimum rate scheduling, there is no statistical multiplexing among cells belonging to
different streams. (As is known in the art, statistical multiplexing takes into account "economies of scale," i.e., the
bandwidth it takes to transmit all the streams together is less than the sum of the individual bandwidths required to transmit
each stream.) A simple way to enhance this scheme is to provide means for serving a cell from a non-empty queue
whenever bandwidth is available. During a cell time, if the common queue is empty, the scheduler services a cell from
one of the non-empty stream queues.

[0008] According to another prior art method, the queue selection is done in a round-robin fashion, and the excess
bandwidth is shared equally among the active streams. A disadvantage of such a scheduler is that queues are served
without regard to QoS. That is, the bandwidth is alternated sequentially among the queues without regard to the urgency of
transmission, i.e., the requested minimum rate, of any specific source. Therefore, this method does not lend itself well to
serving different classes having different QoS requirements.

[0009] Accordingly, there has been considerable interest in packet scheduling algorithms which are intended to
provide weighted shares of the bandwidth on a common link to competing traffic streams, so as to enable service of
different classes. With slightly more complexity, the excess bandwidth can be shared using Weighted Round-Robin (WRR),
Weighted Fair Queuing (WFQ), and Virtual Clock and their variants, which attempt to approximate the idealized
Generalized Processor Sharing (GPS) scheduling, assuming a fluid model of traffic. For WRR, see M. Katevenis,
S. Sidiropoulos, and C. Courcoubetis, Weighted Round-Robin Cell Multiplexing in a General-Purpose ATM Switch, IEEE
JSAC, vol. 9, pp. 1265-1279, October 1991. For WFQ, see A.K. Parekh and R.G. Gallager, A Generalized Processor
Sharing Approach to Flow Control in Integrated Services Networks: The Single-Node Case, IEEE/ACM Trans. on
Networking, vol. 1, pp. 344-357, June 1993. For Virtual Clock, see L. Zhang, Virtual Clock: A New Traffic Control Algorithm
for Packet Switching, ACM Trans. on Computer Systems, vol. 9, pp. 101-124, May 1991. In these schedulers, each
stream is assigned a weight corresponding to the QoS requested by a user of the stream. Accordingly, over an interval
of time in which the number of active streams is fixed, the bandwidth received by an active stream should be roughly
proportional to the assigned weight.

[0010] By an appropriate assignment of weights, each stream can be provided with a share of the link bandwidth that
is proportional to its weight. Hence, each stream receives a minimum bandwidth guarantee. If a stream cannot make
use of all of its guaranteed bandwidth, the excess bandwidth is shared among the active streams in proportion to the
weights. However, a stream with a larger weight will not only receive a higher bandwidth guarantee, but also receive
larger shares of the available bandwidth than streams with smaller weights. Thus, the weight assigned to a connection
determines not only its minimum bandwidth guarantee, but also its share of the available unused bandwidth.
[0011] In this specification the term "weighted fair share scheduler" is used generally to refer to a general class of
work-conserving schedulers which schedule cells so as to give each stream a share of the link bandwidth which is
approximately proportional to a pre-assigned weight. A work-conserving scheduler transmits a cell over the link
whenever there is at least one cell in queue. Thus, a work-conserving scheduler basically determines the order in which
queued cells should be serviced. The operation of such a scheduler is described in the following.
[0012] Consider an idealized fluid model for each traffic stream. Let w_i be the weight assigned to stream i. At time t,
the Generalized Processor Sharing (GPS) discipline serves stream i at rate:

\[ R_i(t) = \frac{w_i}{\sum_{j \in A(t)} w_j} C, \quad i \in A(t), \qquad (2) \]

where A(t) is the set of backlogged streams at time t. Thus, each stream always receives a share of the available
bandwidth which is proportional to its weight. Because of the discrete nature of cells or packets, a real scheduler can only
approximate GPS scheduling. PGPS (Packet-by-packet Generalized Processor Sharing), also known as Weighted Fair
Queuing (WFQ) noted above, and its variants (cf. S.J. Golestani, A Self-Clocked Fair Queuing Scheme for Broadband
Applications, in IEEE INFOCOM '94, Toronto, June 1994; and J.C.R. Bennett and H. Zhang, WF2Q: Worst-Case Fair
Weighted Fair Queuing, in IEEE INFOCOM '96, San Francisco, pp. 120-128, March 1996) are schedulers which
approximate GPS for packet scheduling. Other examples of scheduling schemes which attempt to achieve fair sharing
are the Virtual Clock and Weighted Round-Robin noted above. Several other weighted fair share schedulers have been
proposed in the literature.
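
As a minimal sketch of the fluid GPS rate in Eq. (2), the following Python function computes the service rate of every backlogged stream. The function name `gps_rates`, the dictionary representation of the weights, and the sample values are assumptions made for illustration only; they do not appear in the specification.

```python
def gps_rates(weights, backlogged, capacity):
    """Return the fluid GPS rate of each backlogged stream, following Eq. (2).

    weights    -- dict mapping a stream id to its positive weight w_i
    backlogged -- iterable of stream ids forming the set A(t)
    capacity   -- link capacity C
    """
    active = set(backlogged)
    total = sum(weights[j] for j in active)
    if total <= 0:
        return {}  # no backlogged stream, nothing to serve
    # Each backlogged stream gets the fraction w_i / sum(w_j) of the link.
    return {i: weights[i] / total * capacity for i in active}


# Illustrative values: stream "c" is idle, so "a" and "b" split the whole link.
rates = gps_rates({"a": 1.0, "b": 3.0, "c": 4.0}, ["a", "b"], capacity=160.0)
```

Because the same weight fixes both the guaranteed share and the share of whatever an idle stream leaves unused, a stream with a larger weight always gains more from idle capacity.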

[0013] It should be appreciated that the assigned weight and the current usage of the network would determine
whether a QoS requested by an incoming call can be guaranteed. Therefore, various Connection Admission Control
(CAC) algorithms have been developed which decline service when the QoS cannot be guaranteed. For that matter, the
CAC algorithm must be able to predict the load on the system, including the newly received call if admitted. Therefore,






EP 0 901 301 A2 

delay bounds have been found for WRR, WFQ, Virtual Clock and other fair share packet scheduling algorithms. Using
these delay bounds, admission control schemes can be devised to provide worst-case delay guarantees. The delay
bounds are typically obtained by assuming worst-case behavior for streams controlled by leaky bucket-type open-loop
flow control mechanisms. However, a problem in such an algorithm is that the calculated bounds tend to be rather loose,
since worst-case deterministic assumptions are made in obtaining the bounds.

[0014] Another problem with the prior art schedulers is as follows. Conventionally, schedulers have been designed so
that they are work-conserving with respect to the stream queues, in the sense that whenever link bandwidth is available
and a packet is in the queue, a packet will be transmitted over the link. In other words, if a packet is available for
transmission and there is sufficient bandwidth, the packet will be transmitted and the scheduler will not be idle. The
work-conserving approach has been promoted in the prior art since it presumably results in the highest possible utilization
over the link.

[0015] However, within a switching system or the network, there may be several bottlenecks. For example, some of
the streams may be bottlenecked at a downstream link at another stage within the switch or the network. In this case,
providing these streams more bandwidth than their minimum guaranteed rates (when bandwidth is available) could
exacerbate the congestion at the downstream bottleneck. Such congestion cannot be alleviated by the prior art
schedulers because they are work-conserving with respect to a single bottleneck, servicing cells only in accordance with the
available bandwidth at this bottleneck. That is, conventional weighted fair share schedulers always ensure that excess
bandwidth is utilized and that the share of excess bandwidth made available to each queue is proportional to its weight,
but they do not exercise control on the absolute value of the rate received at a bottleneck point.
[0016] Additionally, if there is a downstream bottleneck, typically backpressure signals which throttle upstream traffic
are used to alleviate congestion. However, backpressure signals are susceptible to on/off oscillations, resulting in
higher cell delay variation (CDV) and, more significantly, loss of throughput due to the equalization of bandwidth
distribution. That is, in the prior art, when a buffer reaches its limit, a backpressure signal is sent to the source. Upon receiving
the signal, the source would stop transmission until the buffer signals that the pressure was relieved. However, at that
time it is likely that all the sources would start transmission again concurrently, thereby overloading the buffer again so
that the backpressure signal is again generated. Therefore, the system may oscillate for some time, causing large
variation in cell delay. Additionally, since all the sources would stop and start transmission at the same time, the throughput
would be equalized irrespective of the QoS requested by each source.

[0017] Since weighted fair share schedulers schedule cells only with respect to a single bottleneck, throughput for a
cell stream may suffer because of backpressure resulting from downstream congestion. Hence, it may not be possible
to guarantee a minimum throughput in this case. Consequently, while the prior art weighted fair share scheduler is
work-conserving with respect to a bottleneck link, it may be non-work-conserving with respect to a downstream bottleneck.
Thus, the present inventors have determined that work-conservation is not always a desirable property and may lead
to further congestion downstream.

[0018] Yet another problem with the prior art weighted fair share schedulers is that they necessitate an algorithm for
searching and sorting out the timestamps applied to the cells in order to determine the next queue to service. More specifically,
in the prior art the timestamps are relative, i.e., the scheduler needs to continuously order the cells according to their
timestamps. For example, the scheduler may order the cells according to the length of the timestamp or according to the
time remaining before the cell would be discarded. Such calculations may slow down the scheduler.
[0019] As is known in the art, the ATM Forum has established four main classes of traffic, generally divided into real
time traffic and non-real time traffic. Constant Bit Rate (CBR) and Variable Bit Rate (VBR) are used for real time traffic.
Available Bit Rate (ABR) and Unspecified Bit Rate (UBR) are non-real time traffic, and are mainly
used for computer communication. As can be appreciated, ABR traffic has no minimum rate requirement and the main
goal in scheduling ABR cells is to "pump" as many cells as possible using the available bit rate.

[0020] A proportional-derivative (PD) controller for ABR service has been proposed by A. Kolarov and G. Ramamurthy
in Proc. IEEE INFOCOM '97, Kobe, Japan, April 1997. The scheduler is implemented on a network-wide basis using
resource management (RM) cells. Generally, the source generates RM cells which propagate through the network. As each
RM cell passes through a switch, it is updated to indicate the supportable rate, i.e., the rate at which the source should
transmit the data (generally called the explicit rate). These RM cells are fed back to the source so that the source may
adjust its transmission rate accordingly. However, the propagation of the RM cells through the network causes a large
delay in controlling the source. While such delay is acceptable for scheduling ABR cells, it is unacceptable for
scheduling real time traffic. Moreover, the delay needs to be accounted for by the scheduler, which complicates the
computations and slows down the scheduler.

SUMMARY OF THE INVENTION

[0021] The present invention provides a new scheduling scheme which uses statistical approaches to admission
control so as to provide much higher utilizations, while maintaining the guaranteed QoS. The general concept of the present
invention is to construct the rate from two components. Herein, the two components will be called first and second
schedule factors. The first schedule factor is a fixed factor determined for each queue, while the second
schedule factor is a variable factor determined by a relative relationship between each queue and the remaining queues.
Specifically, the first schedule factor may be a minimum guaranteed rate while the second schedule factor may be a
portion of the unused bandwidth (rate). Constructing the rate from two components allows the scheduler to operate under
at least three modes: (1) full available rate (i.e., minimum guaranteed rate plus a portion of the unused bandwidth), (2)
minimum guaranteed rate, and (3) halt transmission (with very small probability). In its preferred form, the inventive
scheduling scheme decouples the minimum guaranteed rate from the portion of unused bandwidth and is called Dynamic
Rate Control (DRC).

[0022] The DRC first distributes the bandwidth so as to support the guaranteed QoS, i.e., it supports the minimum
guaranteed rate. Then, the DRC distributes any unused bandwidth to users, based upon a criterion which, in the
preferred embodiment, is independent of the minimum rate guaranteed to the users. A notable feature of the inventive DRC
is that it is not necessarily work conserving, but rather takes into account bottlenecks downstream in determining
whether to allocate unused bandwidth.

[0023] As noted above, a disadvantage of the prior art weighted fair sharing is that the entire bandwidth is allocated
according to the assigned weight. However, it might be desirable to determine the service rate according to:



\[ R_i(t) = M_i + \frac{\psi_i}{\sum_{j \in A(t)} \psi_j} E(t), \qquad (3) \]

where ψ_i is the weight that governs the share of the unused bandwidth E(t) given to stream i and, in general,
ψ_i ≠ w_i. In the above equation, the minimum rate guarantee and the excess bandwidth for a stream
are decoupled. This decoupling allows the network provider to distribute the unused bandwidth independently of the
minimum guaranteed rates. The inventive DRC scheduler naturally decouples the minimum rate guarantee from the
excess bandwidth allocated to a stream. Weights can be assigned on a per-class basis by the CAC or dynamically via
a closed-loop QoS control mechanism.
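
The decoupling described in this paragraph can be illustrated with a short Python sketch. It assumes that the excess-share weights form a separate set (called `share_weights` here, a name chosen for this sketch) and that the unused bandwidth E(t) is supplied by the caller; the function name and the sample values are likewise illustrative and are not taken from the specification.

```python
def decoupled_rates(min_rates, share_weights, active, excess):
    """Rate of each active stream: minimum rate plus a weighted part of the excess.

    min_rates     -- dict: stream id -> minimum guaranteed rate M_i
    share_weights -- dict: stream id -> weight used only for sharing the excess
                     (hypothetical name; assumed independent of M_i)
    active        -- iterable of active stream ids, the set A(t)
    excess        -- unused bandwidth E(t), supplied by the caller
    """
    active = list(active)
    total = sum(share_weights[j] for j in active)
    rates = {}
    for i in active:
        # Share of the excess is proportional to the share weight, not to M_i.
        share = share_weights[i] / total * excess if total > 0 else 0.0
        rates[i] = min_rates[i] + share
    return rates


# Illustrative values: a stream with no guarantee still gets most of the excess.
example = decoupled_rates(
    min_rates={"cbr": 60.0, "ubr": 0.0},
    share_weights={"cbr": 1.0, "ubr": 3.0},
    active=["cbr", "ubr"],
    excess=40.0,
)
```

In this example the stream with a zero minimum rate receives three quarters of the excess, which a single set of weights could not express.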

[0024] Thus, for example, for UBR it may be preferable to assign a very small or even zero guaranteed minimum rate,
but to provide a large portion of the available bandwidth. This will help satisfy many real-time calls, while providing
service for non-real time UBR when there is bandwidth available.

[0025] As also noted above, since a work-conserving scheduler transmits a cell whenever there is at least one cell
in a queue, it only determines the order in which queued cells should be serviced. By contrast, a non-work-conserving
scheduler may allow a cell time on the link to go idle even if there are cells in queue. Hence, in addition to the ordering
of cells for service, timing is also important in non-work-conserving schedulers. Therefore, the present inventors have
developed mechanisms to account for both ordering and timing of cell transmission. However, unlike the prior art, in the
present invention the timestamps are absolute, rather than relative. That is, at any given current time CT, any cell having
a timestamp which equals the current time is eligible for service. Therefore, there is no need for constant ordering of the
cells according to timestamps.
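
One way to picture eligibility by absolute timestamp is a table indexed directly by time, so that the scheduler looks up the current time instead of sorting. The Python sketch below is only an illustration under that assumption: the class name, the dictionary-based calendar, and the method names are hypothetical and are not described in the specification.

```python
from collections import defaultdict, deque


class AbsoluteTimestampScheduler:
    """Toy calendar: queues become eligible when their timestamp equals CT."""

    def __init__(self):
        # Maps an absolute timestamp to the queue ids scheduled for that time.
        self.calendar = defaultdict(deque)

    def schedule(self, queue_id, timestamp):
        """Record that queue_id has a cell stamped with the given absolute time."""
        self.calendar[timestamp].append(queue_id)

    def eligible(self, current_time):
        """Return (and remove) the queues whose timestamp equals current_time."""
        # Direct lookup by time: no search for the smallest timestamp is needed.
        return list(self.calendar.pop(current_time, ()))


sched = AbsoluteTimestampScheduler()
sched.schedule("q1", 5)
sched.schedule("q2", 3)
sched.schedule("q3", 5)
```

Calling `sched.eligible(3)` returns only `q2`, and `sched.eligible(5)` returns `q1` and `q3` in arrival order, without comparing timestamps against each other.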

BRIEF DESCRIPTION OF THE DRAWINGS

[0026] 

Figure 1 is a schematic illustrating a static rate control scheduler according to the prior art.
Figure 2 is a schematic illustrating the dynamic rate control according to the present invention.
Figure 3 is a block diagram depicting an embodiment of an inventive controller based on matching target utilization.
Figure 4 is a block diagram depicting an embodiment of an inventive controller based on matching target queue
length.
Figure 5 is a schematic illustrating the inventive DRC scheduling with overload control.
Figure 6 is a schematic illustrating the inventive DRC scheduling with multiple bottlenecks.
Figure 7 is a schematic illustrating a rate-shaping scheduler structure for per class queuing.
Figure 8 is a schematic illustrating a rate-shaping scheduler structure for per virtual channel queuing.
Figure 9 is a schematic illustrating an input output buffered switch.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0027] A general aspect of the present invention is the provision of a rate which includes two components: a minimum
guaranteed rate and a share of the excess bandwidth. This allows the scheduler to provide service according to various
QoS requirements and to shape the rates while accounting for downstream bottlenecks.

[0028] Unlike GPS-type schedulers, which distribute the entire bandwidth according to assigned weights, the
inventive scheduler first services the minimum rate and then may or may not distribute the unused bandwidth. In its simpler
version, this approach can be derived as follows.

[0029] A fair share scheduler provides stream i with a minimum bandwidth guarantee, which is a function of the entire
available bandwidth, i.e.,

\[ M_i = \frac{w_i}{\sum_{j=1}^{N} w_j} C. \qquad (4) \]

Clearly,

\[ \sum_{i=1}^{N} M_i = C. \qquad (5) \]

[0030] However, it is preferable to separate the rate into two components: the share of the bandwidth as provided by
the minimum rate, plus the share of the unused bandwidth. When A(t) is the set of active streams, the two-component
rate can be written as:

\[ R_i(t) = \frac{w_i}{\sum_{j \in A(t)} w_j} C \]

\[ = w_i \left[ \frac{C}{\sum_{j=1}^{N} w_j} + \frac{E(t)}{\sum_{j \in A(t)} w_j} \right] \qquad (6) \]

\[ = M_i + \frac{w_i}{\sum_{j \in A(t)} w_j} E(t), \quad i \in A(t), \qquad (7) \]

where

\[ E(t) = \frac{\sum_{j \notin A(t)} w_j}{\sum_{j=1}^{N} w_j} C \qquad (8) \]

\[ = \sum_{j \notin A(t)} M_j \qquad (9) \]

\[ = C - \sum_{j \in A(t)} M_j. \qquad (10) \]


In Equation (8), E(t) is representative of the excess or the unused bandwidth available at time t. Referring to Eqs. (6)
and (7), we see that the rate at which stream i is served is the sum of the minimum guaranteed rate M_i and a weighted
fraction of the unused bandwidth E(t). Thus, depending on the load on downstream buffers, the scheduler may or may not
use the second component, i.e., may or may not distribute the unused bandwidth. However, it may always continue to provide
the guaranteed minimum rate.
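
The equivalence between the single GPS rate and its two-component form can be checked numerically. The Python sketch below is a small consistency check under illustrative assumptions: the function names and the sample weights are invented for this example, and the formulas follow Eqs. (4), (7) and (8) as described in the surrounding text.

```python
import math


def gps_rate(weights, active, capacity, i):
    """Fluid GPS rate of stream i: w_i * C / (sum of the weights of active streams)."""
    return weights[i] / sum(weights[j] for j in active) * capacity


def two_component_rate(weights, active, capacity, i):
    """Same rate written as minimum guarantee plus a share of the unused bandwidth."""
    total_all = sum(weights.values())
    total_active = sum(weights[j] for j in active)
    min_rate = weights[i] / total_all * capacity              # Eq. (4)
    idle_weight = sum(w for j, w in weights.items() if j not in active)
    excess = idle_weight / total_all * capacity               # Eq. (8)
    return min_rate + weights[i] / total_active * excess      # Eq. (7)


weights = {1: 1.0, 2: 2.0, 3: 5.0}   # illustrative values only
active = {1, 2}                      # stream 3 is idle
same = all(
    math.isclose(gps_rate(weights, active, 80.0, i),
                 two_component_rate(weights, active, 80.0, i))
    for i in active
)
```

With these values stream 3 contributes its whole guarantee to the unused bandwidth, and `same` confirms that both forms give identical rates for the active streams.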

[0031] While the above scheduler is capable of guaranteeing minimum rate and shaping the rate using the excess
bandwidth, it lacks flexibility insofar as the distribution of the excess bandwidth is closely correlated to the assigned
weights. From Eq. (6), it is clear that the rate at which stream i is served at time t is the weight w_i multiplied by the sum
of the link capacity normalized by the sum of the weights over all streams, and the unused bandwidth normalized by the
sum of the weights over all active streams. Hence, both the minimum guaranteed rate M_i and the excess rate E_i(t) are
proportional to w_i. This is not necessarily desirable from the network provider's point of view. The network provider may
prefer to distribute the unused bandwidth in a different proportion than that of the minimum guaranteed rates M_i.

[0032] Therefore, unless otherwise noted, the remaining description refers to the preferred embodiment of the present
invention wherein a novel dynamic rate control (DRC) is provided which ensures the guaranteed QoS and distributes
unused bandwidth in an efficient manner decoupled from the minimum guaranteed rate. The inventive DRC is not
necessarily work conserving, but rather takes into account bottlenecks downstream in determining whether to allocate
unused bandwidth.

[0033] In the case of a single bottleneck link shared by a set of traffic streams, the inventive DRC provides a minimum
rate guarantee for each stream. Streams which do not make full use of their minimum rate guarantees (i.e., streams with
input rates less than their minimum rate guarantees) contribute to a pool of excess bandwidth which is made available
to streams which transmit in excess of their minimum rates. In DRC scheduling, the distribution of the excess bandwidth
is determined by weights assigned to the streams. In contrast with weighted fair share schedulers, the share of the
excess bandwidth which is made available to a stream in the inventive DRC is decoupled from the minimum rate
guarantees; i.e., the share of the unused bandwidth need not be proportional to the assigned minimum rate guarantees.
[0034] The DRC scheme also strives to provide the minimum rate guarantee on a short time-scale. That is, the DRC
scheduler paces the cells of each stream queue such that the spacing between cells belonging to the same stream is
no smaller than the reciprocal of the minimum rate. If the connection admission control determines a certain minimum
bandwidth requirement for a stream to meet a given QoS, the DRC scheduler should be able to deliver the required QoS
by virtue of its ability to guarantee this minimum rate. Moreover, the DRC scheduler distributes unused bandwidth in a
fair manner among the competing traffic streams.

[0035] When there are multiple bottlenecks in a switch, the DRC scheme can eliminate congestion at all potential
bottlenecks for a given traffic stream. In contrast to the prior art weighted fair share schedulers, the inventive DRC can provide
minimum rate guarantees even in the presence of multiple bottlenecks along a path within the switch. When there are
multiple bottlenecks, the share of unused bandwidth given to a stream at a given bottleneck may also depend on the
state of the downstream bottlenecks. In this case, rate feedback from each bottleneck encountered by a stream is used
to choose the maximum rate at which a virtual channel (VC) can send without causing congestion. Furthermore, DRC
can be extended beyond the switch in a hop-by-hop flow control scheme which can provide end-to-end QoS
guarantees. (As is known in the art, the term virtual channel refers to a link of communication which is established and
maintained for the duration of each call. The link is called a virtual channel since, unlike synchronous transmission, there is no
set channel designated to a particular caller.)
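
The selection of a sending rate from per-bottleneck feedback can be sketched as follows. This is a minimal illustration, assuming that each bottleneck on the path reports one rate and that the stream is limited by the smallest of them; the function name, the argument layout, and the floor at the minimum guaranteed rate are assumptions of this sketch rather than details given in the text.

```python
def allowed_rate(min_rate, bottleneck_rates):
    """Pick the sending rate for one stream from per-bottleneck rate feedback.

    min_rate         -- minimum guaranteed rate M_i of the stream
    bottleneck_rates -- rates fed back by each bottleneck on the stream's path
    """
    if not bottleneck_rates:
        return min_rate
    # The most constrained bottleneck limits the stream: sending faster than
    # the smallest reported rate would congest that bottleneck.
    limit = min(bottleneck_rates)
    # Assumption of this sketch: the rate never falls below the guarantee.
    return max(min_rate, limit)
```

For example, with a guarantee of 10 and feedback rates of 40, 25 and 60, the stream would be allowed to send at 25.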

[0036] In the inventive DRC scheme, the excess bandwidth is shared among competing users via the computation
of dynamic rates in a closed-loop feedback loop. DRC scheduling also requires the internal transmission of control
information within the switch. Notably, the DRC scheme lends itself to a relatively simple rate-shaping scheduler
implementation. Unlike fair share schedulers based on timestamps, no searching or sorting is required to find the smallest
timestamp.

[0037] The main features of the DRC scheduler are outlined below:

1. Provides minimum rate guarantees for each stream.

2. Allows flexible distribution of excess bandwidth. The share of excess bandwidth can be determined by:

(a) Static weights according to traffic class (or other criteria) set by the connection admission control (CAC) and
may be called secondary or class weights. Each class weight may be multiplied by the number of active virtual
channels (VCs) belonging to the given class to achieve fairness with respect to VCs.

(b) Dynamic weights determined according to observed quality-of-service (QoS) by a dynamic closed-loop
control mechanism.

3. Provides internal switch congestion control. This is advantageous especially for providing minimum rate
guarantees without overflow of buffers.

4. Allows extensibility to provide minimum rate guarantees on an end-to-end basis via hop-by-hop flow control.
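
Item 2(a) above, where each class weight is multiplied by the number of active VCs in that class, can be sketched in Python as follows. The function name, the dictionary inputs, the normalization to a sum of one, and the sample classes are assumptions made for this illustration.

```python
def excess_shares(class_weights, active_vcs):
    """Normalized share of the excess bandwidth for each class with active VCs.

    class_weights -- dict: class name -> static class weight set by the CAC
    active_vcs    -- dict: class name -> number of currently active VCs
    """
    # Scale each class weight by its number of active VCs (fairness per VC).
    scaled = {c: class_weights[c] * n for c, n in active_vcs.items() if n > 0}
    total = sum(scaled.values())
    if total <= 0:
        return {}
    # Normalize so that the shares of all active classes sum to one.
    return {c: v / total for c, v in scaled.items()}


# Illustrative values: two VBR connections offset the larger CBR class weight.
shares = excess_shares(
    class_weights={"cbr": 2.0, "vbr": 1.0, "ubr": 1.0},
    active_vcs={"cbr": 1, "vbr": 2, "ubr": 0},
)
```

Here the CBR and VBR classes end up with equal shares, and the class with no active VC receives none.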

Dynamic Rate Control Principle

[0038] This section describes the principles behind the inventive dynamic rate control scheduling. Consider a set of
N ATM cell streams multiplexed onto a link of capacity C. Each stream may correspond to a single virtual connection
(VC), or a group of VCs belonging to the same traffic class, i.e., a group of VCs requiring the same QoS. Associated
with each stream is a queue which stores cells waiting for service, i.e., cells waiting to be transmitted over the link. The
function of a scheduler is to determine when the queued cells are to be serviced.

[0039] The DRC scheduler first ensures the guaranteed QoS. In the preferred embodiment of the inventive
DRC scheduling, the QoS guarantees are mapped onto minimum rate guarantees; however, it should be understood
that other mappings may be used. That is, according to the preferred embodiment, the traffic characteristics (obtained
from some combination of traffic descriptors and traffic measurement) and QoS requirements for stream i are mapped
onto a rate M_i which is to be provided by the DRC scheduler. If the mapping is done correctly, guaranteeing the rate
M_i is then tantamount to providing the QoS guarantee. It is therefore imperative for the scheduler according to the
preferred embodiment to be able to guarantee the minimum rate M_i for each stream.

[0040] For the purposes of this discussion, it is useful to think of stream i as a group of VCs belonging to the same
traffic class, with the same QoS requirements. An embodiment of per VC queuing will be discussed in a later section.
For a given traffic stream i, the bandwidth M_i required to meet cell loss and delay requirements can be computed based
on the traffic parameters of the individual VCs and the buffer size. The multiclass connection admission control (CAC)
scheme developed in G. Ramamurthy and Q. Ren, Multi-Class Connection Admission Control Policy for High Speed
ATM Switches, in Proc. IEEE INFOCOM '97, Kobe, Japan, April 1997, provides procedures for computing M_i for CBR,
VBR, and ABR traffic classes based on the traffic parameters declared by individual VCs. The CAC described in that
paper takes into account statistical multiplexing gain when there are many VCs belonging to a stream and can further
be made more aggressive in its allocations by incorporating traffic measurements at the switch. Given the rate M_i for
each stream, the most important requirement of a scheduler is to ensure that each stream receives service at rate M_i.
For stability of the system, Equation (1) noted above must clearly hold, i.e., the sum of all the individual
rates must be less than or equal to the rate of the common queue.

[0041] In developing the theory behind the inventive Dynamic Rate Control (DRC) scheme, an idealized fluid model for the scheduler is assumed. The stream queues are simultaneously served at the dynamic rates $R_i$ as fluid flows. The input streams to the stream queues consist of discrete cells. Each cell brings a batch of work to the stream queue at which it arrives. The actual implementation of the DRC scheme according to the preferred embodiment is an approximation to the idealized model.

Continuous-time Model 

[0042] Let $A(t)$ be the set of active streams at time t. A stream is considered active if its corresponding stream queue is backlogged. The most general form of the dynamic rate associated with stream i is given by

$$R_i(t) = M_i + \phi_i(t) E(t), \tag{11}$$

where $M_i$ is a minimum guaranteed rate, $E(t)$ is the excess rate available to all streams at the common bottleneck, and $\phi_i(t) \in [0,1]$ is a normalized weighting factor. That is, the dynamic rate comprises two components: the guaranteed minimum rate, $M_i$, and a part of the unused bandwidth, $E(t)$, determined according to the weighting factor, $\phi_i(t)$, assigned to stream i. We refer to

$$E_i(t) = \phi_i(t) E(t) \tag{12}$$

as the variable component of the DRC rate for stream i. The excess rate is defined by:

$$E(t) = C - \sum_{j \in A(t)} M_j, \tag{13}$$

wherein C is the rate of the common queue and $M_j$ is the actual transmission rate of stream j of the set $A(t)$ of active streams at time t. The weights $\phi_i(t)$, $i \in A(t)$, reflect how the excess bandwidth is to be allocated among the streams and are normalized such that

$$\sum_{i \in A(t)} \phi_i(t) = 1. \tag{14}$$

The weights $\phi_i(t)$ are normalized versions of positive weights $w_i(t)$, $i \in A(t)$, i.e.,

$$\phi_i(t) = \begin{cases} \dfrac{w_i(t)}{\sum_{j \in A(t)} w_j(t)}, & i \in A(t), \\ 0, & \text{otherwise.} \end{cases} \tag{15}$$

Eqs. (11) and (13) define an idealized DRC scheduling scheme for the fluid model. The basic concept of the DRC scheme is illustrated in Figure 2.
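
The idealized rate computation can be made concrete with a small numerical sketch. The Python below is only an illustration of Eqs. (11), (13) and (15); the function name `drc_rates`, the dictionary layout, and the stream labels and numbers are assumptions chosen for the example, not part of the disclosed embodiment.

```python
def drc_rates(C, M, w, active):
    """Ideal fluid-model DRC rates (illustrative sketch, names are hypothetical).

    C      -- rate of the common queue
    M      -- dict: stream -> minimum guaranteed rate M_i
    w      -- dict: stream -> positive weight w_i
    active -- set of streams whose queues are backlogged, A(t)
    """
    # Excess rate, Eq. (13): common rate minus the guaranteed rates of active streams.
    E = C - sum(M[j] for j in active)
    w_total = sum(w[j] for j in active)
    rates = {}
    for i in M:
        # Normalized weight, Eq. (15): zero for inactive streams.
        phi = w[i] / w_total if (i in active and w_total > 0) else 0.0
        rates[i] = M[i] + phi * E  # dynamic rate, Eq. (11)
    return E, rates


# Example values (assumed): stream "c" is idle, so its guarantee is shared.
C = 100.0
M = {"a": 20.0, "b": 30.0, "c": 10.0}
w = {"a": 1.0, "b": 1.0, "c": 2.0}
E, rates = drc_rates(C, M, w, active={"a", "b"})
```

With these assumed numbers the two active streams split the 50 units of excess equally, so their dynamic rates add up to the full common rate C.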

[0043] As shown in Figure 2, each stream i has a queue, Q1 - QN, each of which is served at a dynamically variable rate R1 - RN. Each dynamic rate Ri is made up of the guaranteed minimum rate Mi and a share of the unused bandwidth Ei, termed the DRC rate. The flow from all the queues is fed to the common queue CO, which is served at a rate C.
[0044] In practice, it is very difficult to track the set function $A(t)$, since it can change with high frequency. Hence, it is impractical to compute the unused bandwidth via Eq. (13). If the set of traffic streams is reasonably large and the contribution from an individual stream is small relative to the aggregate stream, the statistical multiplexing gain will be high. Under these conditions, the rate at which $E(t)$ changes by a significant amount should be much slower than the rate at which $A(t)$ changes. This suggests that it is more feasible to track $E(t)$ directly than to track $A(t)$ and then compute $E(t)$ via Eq. (13). In DRC, $E(t)$ is estimated using a feedback control loop to be discussed more fully below.

Discrete-time Flow Model

[0045] It is instructive to consider a discrete-time model, where time is partitioned into intervals of length $\Delta$. Assume that stream i arrives to its associated stream queue as a constant fluid flow of rate $X_i(n)$ in the time interval $T_n = (n\Delta, (n+1)\Delta)$. In this time interval, stream queue i is served at the constant rate $R_i(n) = M_i + w_i E(n)$, where $w_i$ is a fixed weight assigned to stream i. The output flow rate, $F_i(n)$, from queue i during the interval $T_n$ is then given by

$$F_i(n) = \min(R_i(n), X_i(n)). \tag{16}$$

That is, if the queue is backlogged, the rate would be $R_i(n)$; otherwise it would be the arrival rate $X_i(n)$. The aggregate flow rate to the bottleneck queue during $T_n$ is then

$$F(n) = \sum_{i=1}^{N} F_i(n). \tag{17}$$

The excess bandwidth, $E(n)$, over the interval is the sum of a static, unallocated portion of the bandwidth, and a dynamic part of the bandwidth that is currently not used by streams which transmit at less than their minimum guaranteed rates:

$$E(n) = \Big(C - \sum_{i=1}^{N} M_i\Big) + \sum_{i=1}^{N} \big(M_i - X_i(n)\big)^{+}, \tag{18}$$

where $x^{+} \triangleq \max(x, 0)$. Again, since it is difficult to obtain knowledge of the input flow rates $X_i(n)$, the present inventors developed an indirect means of computing $E(n)$ via a control loop.
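
To make the discrete-time quantities tangible, the following Python sketch evaluates the excess bandwidth and the per-queue flows for one interval. It assumes the arrival rates are known, which the text notes is generally not the case in practice, and it takes the static portion to be the common rate minus the sum of the minimum guaranteed rates. The function names and the numbers are illustrative assumptions.

```python
def excess_bandwidth(C, M, X):
    """Excess bandwidth for one interval: static part plus unused guarantees."""
    static = C - sum(M)                                   # unallocated bandwidth
    dynamic = sum(max(m - x, 0.0) for m, x in zip(M, X))  # guarantees not being used
    return static + dynamic


def interval_flows(C, M, w, X):
    """Shaping rates R_i(n) and output flows F_i(n) for one interval."""
    E = excess_bandwidth(C, M, X)
    R = [m + wi * E for m, wi in zip(M, w)]  # R_i(n) = M_i + w_i E(n)
    F = [min(r, x) for r, x in zip(R, X)]    # Eq. (16)
    return E, R, F


# Example values (assumed): stream 1 arrives well below its guarantee of 30.
E_n, R_n, F_n = interval_flows(
    C=100.0,
    M=[20.0, 30.0, 40.0],
    w=[0.5, 0.5, 0.0],
    X=[50.0, 10.0, 40.0],
)
```

Here stream 0 is backlogged and is served at its shaping rate, stream 1 is served at its arrival rate, and stream 2 receives exactly its guarantee because its weight is zero.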
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Closed-loop Control 

[0046] The excess bandwidth E can be estimated via a feedback control loop. By adjusting the value of E, the aggregate flow rate to the bottleneck queue, denoted by F, can be controlled such that:

1. The length of the bottleneck queue is close to a target value $Q_0$; or

2. The average utilization at the bottleneck queue is close to a target utilization value $U_0 < 1$.

[0047] Two control algorithms are disclosed herein for estimating E based, respectively, on matching a target queue 
length and a target utilization. Also disclosed herein is a hybrid control algorithm which combines the merits of the first 
two. 

Matching a Target Utilization 

[0048] Consider the continuous-time model of the scheduler. Let $F(t)$ denote the aggregate flow rate into the bottleneck queue at time t. We wish to control $F(t)$ to achieve a target utilization $U_0 \in (0,1)$.
[0049] The following proportional control law can be used:

$$\dot{F}(t) = -\alpha_0 \big(F(t) - U_0 C\big). \tag{19}$$

[0050] That is, the rate of change of the aggregate flow rate is proportional to the aggregate flow rate less the product of the target utilization and the rate of the common queue.

[0051] The control system is stable for $\alpha_0 > 0$ and converges with exponential decay rate $\alpha_0$. In terms of the input streams, the aggregate flow rate can be expressed as:

$$F(t) = \sum_{i \in A(t)} R_i(t) \tag{20}$$

$$\phantom{F(t)} = \sum_{i \in A(t)} M_i + E(t). \tag{21}$$

[0052] Taking derivatives on both sides (wherever $F(t)$ is differentiable), we have

$$\dot{F}(t) = \dot{E}(t). \tag{22}$$

Hence, the control law (19) can be re-written as:

$$\dot{E}(t) = -\alpha_0 \big[F(t) - U_0 C\big]. \tag{23}$$

[0053] The control law Eq. (23) is the basis for a method for estimating $E(t)$ in the discrete-time model. The discrete-time form of Eq. (23) is as follows:

$$E(n+1) = E(n) - \alpha_0 \varepsilon(n), \tag{24}$$

where we define the error signal as $\varepsilon(n) = F(n) - U_0 C$. Since the excess bandwidth must lie in the interval $[0, C]$, the control law takes the form:

$$E(n+1) = I_{[0,C]}\big(E(n) - \alpha_0 \varepsilon(n)\big), \tag{25}$$

where $I_{[0,C]}(x) = 1$ if x is equal to or larger than zero, but equal to or less than C; otherwise, $I_{[0,C]}(x) = 0$. Over an interval in which the input fluid stream flows are constant, the recursion in Eq. (25) will converge to the correct value for the excess bandwidth E. The speed of convergence depends on the values of the coefficient $\alpha_0$ and the sampling interval $\Delta$.
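
The convergence claim can be explored with a short simulation. The sketch below is a minimal illustration, not the disclosed implementation. It assumes that the restriction of the excess bandwidth to the interval from 0 to C is realized by clamping the updated value into that interval, and it drives the controller with the fluid flow model in which every queue sends at the smaller of its shaping rate and its arrival rate. All names and numbers are assumptions.

```python
def clamp(x, lo, hi):
    """Keep x inside [lo, hi] (assumed reading of the restriction to [0, C])."""
    return max(lo, min(hi, x))


def utilization_update(E, F, C, U0, alpha0):
    """One controller step: move E against the utilization error F - U0*C."""
    error = F - U0 * C
    return clamp(E - alpha0 * error, 0.0, C)


def simulate(C, M, w, X, U0, alpha0, steps):
    """Iterate the controller with constant input flows X."""
    E, F = 0.0, 0.0
    for _ in range(steps):
        # Aggregate flow: each queue sends at min(shaping rate, arrival rate).
        F = sum(min(m + wi * E, x) for m, wi, x in zip(M, w, X))
        E = utilization_update(E, F, C, U0, alpha0)
    return E, F


# Example values (assumed): two continuously backlogged streams.
E_final, F_final = simulate(
    C=100.0, M=[20.0, 30.0], w=[0.5, 0.5],
    X=[100.0, 100.0], U0=0.95, alpha0=0.5, steps=60,
)
```

In this assumed scenario the guarantees total 50, so the estimate settles at 45 and the aggregate flow at 95, which is the target utilization of 0.95 times C.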
[0054] Figure 3 shows a block diagram of the controller based on matching the target utilization $U_0$. The error is calculated by adder 10 and is provided to the controller 20. The controller 20 outputs the current excess bandwidth $E(n)$, which is fed back to the DRC scheduler 30. In practice, there is a delay in the feedback loop between the controller and the scheduler. However, within a switch, this delay y is typically negligible relative to the sampling interval $\Delta$ and can be ignored. The DRC scheduler allocates the excess bandwidth $E(n)$ to the input streams $X_1(n)$ - $X_N(n)$ according to the DRC scheme disclosed above, which results in an aggregate flow rate $F(n)$.

Matching a Target Queue Length

[0055] Let $Q(t)$ be the length of the bottleneck queue at time t and let $Q_0$ be a target queue length value. Assuming a fluid model of traffic, the queue length grows according to the aggregate rate less the common queue rate:

$$\dot{Q}(t) = F(t) - C. \tag{26}$$

The proportional control law,

$$\dot{F}(t) = -\alpha'_0 \big(Q(t) - Q_0\big), \tag{27}$$

leads to the equation

$$\ddot{Q}(t) + \alpha'_0 Q(t) = \alpha'_0 Q_0. \tag{28}$$

(See, e.g., L. Benmohamed and S. Meerkov, Feedback Control of Congestion in Packet Switching Networks: The Case of a Single Congested Node, IEEE/ACM Trans. on Networking, vol. 1, pp. 693-708, December 1993.) The characteristic equation for (28) has a pair of purely imaginary roots, implying non-decaying, oscillatory behavior of $Q(t)$. This problem can be resolved by adding a derivative term to (27), resulting in the proportional-derivative (PD) controller

$$\dot{F}(t) = -\alpha'_0 \big(Q(t) - Q_0\big) - \alpha'_1 \dot{Q}(t). \tag{29}$$

The corresponding differential equation governing the behavior of $Q(t)$ is:

$$\ddot{Q}(t) + \alpha'_1 \dot{Q}(t) + \alpha'_0 Q(t) = \alpha'_0 Q_0, \tag{30}$$

which is stable if $\alpha'_0, \alpha'_1 > 0$. The convergence rate can be assigned arbitrarily by appropriate choices for $\alpha'_0$ and $\alpha'_1$. From (29) and (21), the unused bandwidth can be obtained as:

$$\dot{E}(t) = -\alpha'_0 \big(Q(t) - Q_0\big) - \alpha'_1 \dot{Q}(t). \tag{31}$$

Then a discrete-time controller can be obtained from (31) as

$$E(n+1) = I_{[0,C]}\big(E(n) - \alpha'_0 \varepsilon(n) - \alpha'_1 \varepsilon(n-1)\big), \tag{32}$$

where $\varepsilon(n) = Q(n) - Q_0$ is the error signal sampled at time n. Here, $Q(n)$ is the queue length sampled at time $t = n\Delta$. This controller attempts to keep the queue length near $Q_0$, maintaining the utilization at 100%.
[0056] Figure 4 shows a block diagram of the controller. The target queue length $Q_0$ is subtracted from the queue length at time n, $Q(n)$, by the adder 14, to provide the error $\varepsilon(n)$ to the controller 24. The controller 24 outputs the unused bandwidth $E(n)$ to be fed back to the DRC controller. Again a delay y may be introduced, but may be ignored when the DRC is used within the switch. The DRC controller then allocates the available bandwidth to generate an aggregate flow rate $F(n)$.
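
A single update of the queue-length controller can be sketched as follows. This is an illustrative Python fragment only. It assumes that the restriction of the excess bandwidth to the interval from 0 to C is implemented by clamping, and the parameter names `a0` and `a1` stand in for the two primed gain coefficients. The numbers are made up for the example.

```python
def queue_pd_update(E, Q_now, Q_prev, Q0, C, a0, a1):
    """One discrete-time step of the queue-length controller (sketch).

    E       -- current excess bandwidth estimate E(n)
    Q_now   -- queue length sampled at time n
    Q_prev  -- queue length sampled at time n-1
    Q0      -- target queue length
    a0, a1  -- controller gains (hypothetical names)
    """
    err_now = Q_now - Q0
    err_prev = Q_prev - Q0
    updated = E - a0 * err_now - a1 * err_prev
    return max(0.0, min(C, updated))  # keep the estimate inside [0, C]


# Queue is 20 cells above target now and was on target one sample earlier.
E_next = queue_pd_update(E=10.0, Q_now=120.0, Q_prev=100.0, Q0=100.0,
                         C=100.0, a0=0.2, a1=0.1)

# A large backlog drives the estimate to the lower bound.
E_floor = queue_pd_update(E=1.0, Q_now=200.0, Q_prev=200.0, Q0=100.0,
                          C=100.0, a0=0.2, a1=0.1)
```

When the queue exceeds its target the estimate is reduced, which lowers every shaping rate and lets the bottleneck queue drain back toward the target.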

Hybrid Control 

[0057] Clearly the disadvantage of matching the flow rate to a target utilization, $U_0$, is that bandwidth is lost since $U_0$ must be less than one. However, if queue length information is not readily available, the control algorithm based on flow rate measurement is a viable alternative. The control algorithm based on queue length information achieves 100% utilization. A disadvantage of this algorithm, however, is that when the utilization is less than 100% the system is not controlled. If the utilization is less than 100%, the queue length is zero; hence, $E(n)$ reaches the maximum value C. Now if the aggregate traffic flow increases to a rate close to C, the queue may grow to a large value before the controller can bring the queue length to the target value $Q_0$.

[0058] If both flow rate and queue length information are available, the merits of both controller algorithms may be combined in a hybrid controller as follows:

    if $F(n) < U_0 C$ then
        $E(n+1) = E(n) - \alpha_0 (F(n) - U_0 C)$
    else
        $E(n+1) = E(n) - \alpha'_0 (Q(n) - Q_0) - \alpha'_1 (Q(n-1) - Q_0)$
    end if
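
The hybrid rule translates almost line for line into code. The Python sketch below mirrors the pseudocode as written, without any additional limiting of the result; a practical version would presumably also keep the estimate within the interval from 0 to C. The names `a0`, `b0` and `b1` are hypothetical stand-ins for the utilization gain and the two queue-length gains.

```python
def hybrid_update(E, F, Q_now, Q_prev, C, U0, Q0, a0, b0, b1):
    """One step of the hybrid controller (illustrative sketch).

    Below the target utilization the flow-rate law is used;
    otherwise the queue-length law takes over.
    """
    if F < U0 * C:
        return E - a0 * (F - U0 * C)
    return E - b0 * (Q_now - Q0) - b1 * (Q_prev - Q0)


# Under-utilized link (assumed numbers): flow 80 against a target of 90.
E_low = hybrid_update(E=10.0, F=80.0, Q_now=0.0, Q_prev=0.0,
                      C=100.0, U0=0.9, Q0=50.0, a0=0.5, b0=0.2, b1=0.1)

# Heavily used link: the queue is 10 cells above its target.
E_high = hybrid_update(E=10.0, F=95.0, Q_now=60.0, Q_prev=50.0,
                       C=100.0, U0=0.9, Q0=50.0, a0=0.5, b0=0.2, b1=0.1)
```

In the first call the estimate grows because bandwidth is going unused; in the second it shrinks because the queue has overshot its target.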

[0059] [...] each of the two controllers. Thus, when the utilization is less [...]

Overload Control 

[0060] The hybrid closed-loop controller adjusts the unused bandwidth $E(n)$ to reduce the magnitude of an error signal

$$\varepsilon(n) = \begin{cases} F(n) - U_0 C & \text{if } F(n) < U_0 C, \\ Q(n) - Q_0 & \text{otherwise.} \end{cases} \tag{33}$$

[...]

where $I_X(x)$ denotes the indicator function on the set X. That is, $I_{\{Q<Q_1\}}(Q(n)) = 1$ if $Q(n) < Q_1$, and $I_{\{Q<Q_1\}}(Q(n)) = 0$ otherwise. Similarly, $I_{\{Q<Q_2\}}(Q(n)) = 1$ if $Q(n) < Q_2$, and $I_{\{Q<Q_2\}}(Q(n)) = 0$ otherwise.

Multiple Bottlenecks

[0063] One notable application for the inventive DRC is in a multi-stage ATM switch which has already been filed by the same applicant. In a multi-stage ATM switch, there are multiple bottleneck points. A stream may pass through several stages before reaching an output line of the switch. In the DRC approach, the streams are rate-controlled at the input stage to control congestion within the switch itself, as opposed to controlling flow in the network. Therefore, the bulk of the cell buffering occurs at the input stage of the switch.

[0064] Consider an individual stream which passes through B bottlenecks. At the j-th bottleneck, a DRC rate $E^{(j)}(n)$ is computed at the nth sampling interval. Define the overall bottleneck excess rate as:

$$E^{*}(n) = \min_{1 \le j \le B} E^{(j)}(n). \tag{35}$$

Let $Q_1^{(j)}$ and $Q_2^{(j)}$ denote, respectively, the shape and stop thresholds at the jth bottleneck. Define the vectors:

$$\mathbf{Q}_i = \big[Q_i^{(j)} : 1 \le j \le B\big], \quad i = 1, 2. \tag{36}$$

[0065] Denote the queue length at the jth bottleneck at time n by $Q^{(j)}(n)$ and define the vector:

$$\mathbf{Q}(n) = \big[Q^{(j)}(n) : 1 \le j \le B\big]. \tag{37}$$

Then, in analogy to Eq. (34), the dynamic rate for stream i for the multiple bottleneck case is computed as:

$$R_i(n) = \big[M_i + I_{\{\mathbf{Q}<\mathbf{Q}_1\}}(\mathbf{Q}(n)) \cdot w_i E^{*}(n)\big] \cdot I_{\{\mathbf{Q}<\mathbf{Q}_2\}}(\mathbf{Q}(n)). \tag{38}$$

[0066] Figure 6 shows a set of stream queues, Q1 - QN, along with a set of bottleneck queues. At the j-th bottleneck queue, a DRC rate, $E_j$, is estimated based on flow and queue length information. For a given stream, e.g. ST3, the overall DRC rate is the minimum of the bottleneck rates for bottlenecks traversed by the stream. From the perspective of the given stream queue, congestion in downstream bottlenecks is controlled and the queuing of cells is pushed upstream when congestion arises at one or more bottlenecks. Ultimately, the congestion is pushed back to the stream queue, where most of the queuing takes place for the stream.
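
The multiple-bottleneck rate can be sketched in a few lines of Python. This is an illustration under a stated assumption: the vector comparisons in the indicator functions are read componentwise, so that the shape (or stop) condition holds only if every traversed bottleneck is below its shape (or stop) threshold. Function and variable names are hypothetical.

```python
def multi_bottleneck_rate(M_i, w_i, E, Q, Q1, Q2):
    """Dynamic rate for one stream crossing several bottlenecks (sketch).

    E  -- list of per-bottleneck excess rate estimates
    Q  -- list of current queue lengths at those bottlenecks
    Q1 -- list of shape thresholds, Q2 -- list of stop thresholds
    """
    E_star = min(E)  # the tightest bottleneck governs the excess share
    below_shape = all(q < q1 for q, q1 in zip(Q, Q1))
    below_stop = all(q < q2 for q, q2 in zip(Q, Q2))
    rate = M_i + (w_i * E_star if below_shape else 0.0)
    return rate if below_stop else 0.0


# Example values (assumed): three bottlenecks with identical thresholds.
E = [40.0, 20.0, 30.0]
Q1, Q2 = [10.0] * 3, [20.0] * 3
normal = multi_bottleneck_rate(10.0, 0.5, E, [5.0, 5.0, 5.0], Q1, Q2)
shaped = multi_bottleneck_rate(10.0, 0.5, E, [5.0, 12.0, 5.0], Q1, Q2)
stopped = multi_bottleneck_rate(10.0, 0.5, E, [5.0, 25.0, 5.0], Q1, Q2)
```

The three calls show the three regimes: full dynamic rate, fallback to the minimum guaranteed rate once a shape threshold is crossed, and no transmission once a stop threshold is crossed.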

Rate-shaping Scheduler

[0067] In order to implement DRC, a mechanism for rate-shaping a number of streams is necessary. Two implementations of a scheduler which shapes a set of streams according to DRC rates are disclosed herein. The first is appropriate when the number of streams is relatively small (on the order of a hundred or less), for example, when cells are queued according to class. The second implementation can handle a large number of streams (on the order of tens of thousands), but is slightly more complex.

Scheduling for Rate-shaping 

[0068] DRC scheduling is implemented using timestamps. A timestamp, TS, is associated with each queue. A stream is active if and only if its corresponding queue is non-empty. Otherwise, the stream is inactive. The DRC scheduler schedules only active streams for service. When a stream is served, the first cell in the associated queue is transmitted to the second stage queue and the status of the stream is updated.

[0069] Two distinct timestamp computation formulas are provided, depending on whether a queue is to be scheduled or rescheduled. The timestamp computations ensure that each stream is shaped to the appropriate rate, as determined by the DRC scheme.

Scheduling

[0070] A given queue is scheduled when the queue is empty and a new cell arrives to the queue, i.e., when the associated stream changes from the inactive to the active state. The basic formula for computing the new timestamp for scheduling queue i is given as follows:

$$TS_i = \max\{CT,\ TS_i + 1/R_i(n)\}, \tag{39}$$

where CT is the current time, $CT \in (n\Delta, (n+1)\Delta)$, and $R_i(n)$ is the dynamic rate for queue i at time n.

Rescheduling

[0071] A given queue is rescheduled after an active stream has been served and the stream remains active, i.e., its associated queue remains non-empty. In this case, the timestamp computation for rescheduling queue i is:

$$TS_i = TS_i + 1/R_i(n). \tag{40}$$

Catching-up with current time

[0072] A queue is said to be ready at current time CT if $TS_i \le CT$. This means that the queue can be served [...] being scheduled. With the catch up provision, the scheduling and rescheduling procedures are as follows:

Scheduling:
    if $TS_i < CT - 1/M_i$ then
        $TS_i = \max\{CT,\ TS_i + 1/M_i\}$
    else
        $TS_i = \max\{CT,\ TS_i + 1/R_i(n)\}$
    end if

Rescheduling:
    if $TS_i < CT - 1/M_i$ then
        $TS_i = TS_i + 1/M_i$
    else
        $TS_i = TS_i + 1/R_i(n)$
    end if
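
The two procedures with the catch-up provision can be written compactly in Python. The sketch below follows the pseudocode: when a timestamp has fallen more than one minimum-rate interval behind the current time, the next timestamp is advanced by the minimum-rate interval instead of the dynamic-rate interval. Rates are expressed in cells per time slot, and the function names and numbers are illustrative assumptions.

```python
def schedule(TS, CT, M, R):
    """New timestamp for a queue that has just become active (sketch)."""
    lagging = TS < CT - 1.0 / M          # timestamp has fallen behind
    step = 1.0 / M if lagging else 1.0 / R
    return max(CT, TS + step)


def reschedule(TS, CT, M, R):
    """New timestamp for a queue that was served and is still non-empty."""
    lagging = TS < CT - 1.0 / M
    step = 1.0 / M if lagging else 1.0 / R
    return TS + step


# Assumed rates: guarantee of 0.25 cells/slot, dynamic rate of 0.5 cells/slot.
M_i, R_i = 0.25, 0.5
on_time = reschedule(TS=10.0, CT=11.0, M=M_i, R=R_i)   # spaced by 1/R_i
catching_up = reschedule(TS=2.0, CT=11.0, M=M_i, R=R_i)  # spaced by 1/M_i
fresh = schedule(TS=2.0, CT=11.0, M=M_i, R=R_i)        # never earlier than CT
```

A queue whose timestamp remains below the current time stays ready, so the lagging case lets it be served again at once while its timestamp advances in minimum-rate steps.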
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Serving Ready Queues 

[0074] Serving a ready queue consists of transmitting the first cell in the queue to the bottleneck queue and, if necessary, rescheduling the queue for service. It is possible that several queues may become ready at the same time. In practice, it is not possible to serve all ready queues during one cell time. Thus, a collection of ready queues can form.
[0075] The ready queues can in turn be scheduled for service by means of a work-conserving scheduler. For example, ready queues could simply be served in round-robin fashion. Alternatively, a weighted fair share scheduler could be used to ensure fair short time-scale bandwidth share among the ready queues. However, the improved bandwidth share does not warrant the considerable additional complexity required to implement weighted fair share scheduling.
[0076] The preferred embodiment implements a round-robin with four priority levels, listed below in decreasing order of priority:

1. Dynamic high priority (HP)

2. Real-time, Short CDV (RT-S)

3. Real-time, Long CDV (RT-L)

4. Non-real-time (NRT)

The HP priority is a dynamic assignment. Ready queues which have been scheduled at their minimum guaranteed rates are automatically assigned as HP. This ensures that all streams receive their minimum rate guarantees on a short time-scale. The remaining three priority levels are assigned statically, according to traffic class and tolerance for cell delay variation (CDV). Streams classified as RT-S are real-time streams which have small CDV tolerances, while RT-L streams have larger CDV tolerances. Non-real-time (NRT) streams generally do not have requirements on CDV.
[0077] In general, low bit-rate real-time streams would be classified as RT-L, while high bit-rate real-time streams would be classified as RT-S. However, the CDV tolerance of a stream need not be directly related to its bit-rate. The static priority levels protect streams with small CDV tolerance from the bunching effects of streams with larger CDV tolerances. For example, consider a scenario in which there are one thousand 75 kbps voice streams sharing a 150 Mbps link with a single 75 Mbps multimedia stream. Assuming that the multimedia stream is constant bit rate (CBR), it needs to send a cell once every two cell times. If cells from the voice streams are bunched together (at or near the same timeslot), the multimedia stream will suffer from severe CDV, relative to its inter-cell gap of one cell time. In the worst case, two cells from the multimedia stream could be separated by up to one thousand voice cells.

Per Class Queuing 

[0078] In the case of per class queuing, when the number of streams is relatively small, the scheduler can be implemented with a parallel array of comparators. The ith comparator takes as inputs CT and $TS_i$ and evaluates

$$f_i = \begin{cases} 1 & CT \ge TS_i \text{ and queue } i \text{ active}, \\ 0 & CT < TS_i. \end{cases} \tag{42}$$

[0079] Queues with $f_i = 1$ have had their timestamps expire and hence are ready for service. These queues are served using round-robin with priority based on a priority flag $P_i$ assigned as follows:

$$P_i = \begin{cases} 0 & \text{if queue } i \text{ scheduled at rate } M_i, \\ 1 & \text{if queue } i \text{ is RT-S}, \\ 2 & \text{if queue } i \text{ is RT-L}, \\ 3 & \text{if queue } i \text{ is NRT}. \end{cases} \tag{43}$$
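
A software analogue of the comparator array and the prioritized round-robin service can be sketched as follows. The embodiment describes parallel hardware comparators; the sequential Python below is only an illustration of the selection logic. The class name `StreamQueue`, the function names, and the `start` pointer used to rotate the round-robin search are assumptions.

```python
from dataclasses import dataclass


@dataclass
class StreamQueue:
    ts: float       # timestamp TS_i
    active: bool    # True if the queue is non-empty
    priority: int   # 0 = HP, 1 = RT-S, 2 = RT-L, 3 = NRT


def is_ready(queue, ct):
    """Ready flag: timestamp expired and queue active."""
    return queue.active and ct >= queue.ts


def select_queue(queues, ct, start=0):
    """Round-robin search, highest priority level first (sketch).

    Returns the index of the queue to serve, or None if nothing is ready.
    """
    n = len(queues)
    for level in range(4):                 # priority 0 is searched first
        for offset in range(n):
            i = (start + offset) % n       # rotate from the round-robin pointer
            if is_ready(queues[i], ct) and queues[i].priority == level:
                return i
    return None


queues = [
    StreamQueue(ts=5.0, active=True, priority=3),
    StreamQueue(ts=3.0, active=True, priority=1),
    StreamQueue(ts=9.0, active=True, priority=0),
    StreamQueue(ts=2.0, active=False, priority=0),
]
```

With these assumed queues, at time 6 the RT-S queue at index 1 is chosen ahead of the ready NRT queue, and at time 10 the HP queue at index 2 takes precedence.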



[0080] A logical view of the scheduler is illustrated in Figure 7. The scheduler performs a round-robin search for a queue i satisfying $f_i = 1$ and $P_i = 0$. If no such queue exists, the round-robin search continues by searching for queue i satisfying $f_i = 1$ and $P_i = 1$. The process continues until either a queue is found, or all priority levels have been searched.
[0081] In per VC queuing, the number of queues is on the order of tens of thousands. In this case it is not [...] on a time [...] four linked lists (one for each priority level) of VC identifiers whose timestamps correspond to the bin label. During each time slot the current time CT [...]

Providing Multi-class Quality-of-Service

[0082] To provide quality-of-service (QoS), a connection admission control (CAC) algorithm is necessary to [...] DRC scheduling simplifies the CAC function by providing a direct mapping between the bandwidth required to provide QoS to a stream and the rate at which the stream is scheduled for service. [...] able to provide the minimum guaranteed rate [...] then the QoS [...]

[...] however, from an implementation point of view [...] a large number of queues. The main advantage of per VC queuing is that [...]

[...] described for the case wherein a static weight, $w_i$, is assigned to each class by the CAC. The value of the weight $w_i$ determines the share of the free bandwidth that is allocated to class i. [...] closed-loop [...]

Per Class Queuing

[0085] [...] corresponding to class i is an aggregate of several individual [...] weight, $w_i$. The dynamic rate for stream i at a single bottleneck point is computed as

$$R_i(n) = M_i + w_i E(n), \tag{44}$$

[...]

$$S(n) = \sum_{j} w_j. \tag{45}$$

[...] in Eq. (43) will be discussed further below.

[0086] [...] the number of active VC streams composing stream i. By setting $w_i = n_i(n)$, the unused bandwidth can be distributed fairly with respect to the individual VCs. In this case [...]

Per VC Queuing

[0087] In per VC queuing, for N VCs there are N VC queues. Each VC queue belongs to one of K classes. Let $c(i)$ denote the class to which VC i belongs and let $C_k$ denote the set of VCs belonging to class k. VC i is assigned a minimum guaranteed bandwidth $M_i$, and class k a minimum guaranteed bandwidth $M_k$, which is sufficient to guarantee QoS for all VCs in class k. When VC i in class k becomes inactive, the unused bandwidth $M_i$ is first made available to VCs belonging to class $c(i)$, up to the class guaranteed bandwidth $M_k$, and then to the VCs in the other classes.
[0088] In the per VC paradigm a dynamic rate is computed for VC i as follows:

$$R_i(n) = M_i + E_{c(i)}(n), \tag{46}$$

where $E_k(n)$ denotes the estimated unused bandwidth for class k at the sampling instant n. We assume that the flow rate at time n, $F_k(n)$, of class k cells into the common queue can be measured. Also, we assume that it is possible to count the number, $Q_k(n)$, of class k cells in the common queue at time n. The common queue length is given by

$$Q(n) = \sum_{k=1}^{K} Q_k(n). \tag{47}$$

[0089] The excess bandwidth at the bottleneck point, denoted $E(n)$, is estimated as a function of the aggregate flow rate, $F(n)$, and the common queue length, $Q(n)$. It can be computed using the hybrid PD controller discussed earlier:

    if $F(n) < U_0 C$ then
        $E(n+1) = E(n) - \alpha_0 (F(n) - U_0 C) - \alpha_1 (F(n-1) - U_0 C)$
    else
        $E(n+1) = E(n) - \alpha'_0 (Q(n) - Q_0) - \alpha'_1 (Q(n-1) - Q_0)$
    end if

[0090] The minimum guaranteed class bandwidth, $M_k$, is determined by the CAC. The dynamic rate for class k is computed as:

$$R_k(n) = M_k + w_k n_k(n) E(n), \tag{48}$$

where $w_k$ is the weight for class k and $n_k(n)$ is an estimate for the number of active VCs belonging to class k.

[0091] The dynamic class rate $R_k(n)$ represents the bandwidth available to VCs in class k. For VCs in class k, the unused bandwidth is computed with respect to $R_k(n)$. The unused bandwidth for class k, $E_k(n)$, can be computed using the hybrid PD controller as follows:

    if $F_k(n) < U_0^{(k)} R_k(n)$ then
        $E_k(n+1) = E_k(n) - \alpha_0 (F_k(n) - U_0^{(k)} R_k(n)) - \alpha_1 (F_k(n-1) - U_0^{(k)} R_k(n-1))$
    else
        $E_k(n+1) = E_k(n) - \alpha'_0 (Q_k(n) - Q_0^{(k)}) - \alpha'_1 (Q_k(n-1) - Q_0^{(k)})$
    end if

Here, $U_0^{(k)}$ and $Q_0^{(k)}$ are, respectively, the target utilization and target queue length for class k.

[0092] Thus, the above novel per VC queuing accomplishes a two-tier distribution of the unused bandwidth: a first distribution according to class and a second distribution according to VCs within the class.
[0093] In equation (46) it is implied that $E_{c(i)}(n)$ is distributed evenly among the active streams within the class. However, continuing with the DRC theme, one may elect to have variable distribution. This can be accomplished by introducing a weight factor, e.g., $\phi_i$, assigned to each VC within the class. Thus, the rate would be computed as:

$$R_i(n) = M_i + \phi_i E_{c(i)}(n).$$
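
The two-tier computation can be illustrated with a brief Python sketch covering the class rate, one step of the class-level hybrid controller, and the weighted per-VC rate. All function names, the gain names `a0`, `a1`, `b0`, `b1`, and the numbers are assumptions made for the example.

```python
def class_rate(M_k, w_k, n_k, E):
    """Dynamic class rate: class guarantee plus a share of the link excess."""
    return M_k + w_k * n_k * E


def class_excess_update(E_k, F_now, F_prev, R_now, R_prev,
                        Q_now, Q_prev, U0_k, Q0_k, a0, a1, b0, b1):
    """One step of the class-level hybrid controller (sketch)."""
    if F_now < U0_k * R_now:
        # Class is under-using its rate: track the class utilization target.
        return (E_k - a0 * (F_now - U0_k * R_now)
                    - a1 * (F_prev - U0_k * R_prev))
    # Otherwise track the class queue-length target.
    return E_k - b0 * (Q_now - Q0_k) - b1 * (Q_prev - Q0_k)


def vc_rate(M_i, phi_i, E_class):
    """Per-VC rate with a per-VC weight on the class excess."""
    return M_i + phi_i * E_class


# Assumed numbers: 5 active VCs in the class, link excess estimate of 40.
R_k = class_rate(M_k=30.0, w_k=0.01, n_k=5, E=40.0)
E_k_next = class_excess_update(E_k=5.0, F_now=20.0, F_prev=22.0,
                               R_now=32.0, R_prev=32.0,
                               Q_now=0.0, Q_prev=0.0, U0_k=0.9, Q0_k=10.0,
                               a0=0.5, a1=0.25, b0=0.2, b1=0.1)
R_vc = vc_rate(M_i=2.0, phi_i=1.0, E_class=3.0)
```

The first tier sets how much bandwidth the class as a whole may use; the second tier decides how the class's own unused bandwidth is handed to its individual VCs.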

Closed-loop Quality-of-Service Control

[0094] In the DRC scheme, the excess bandwidth, $E(n)$, is computed via a feedback control loop. This excess bandwidth [...] assume per class queuing, although the same methodology applies under per VC queuing.

Concept

[0095] As discussed in the previous section, a CAC function is necessary to guarantee QoS. However, [...] in conjunction with a dynamic CAC [...]

[...] proportional to the deviation between the observed QoS at time t [...] queue i. The normalized deviation of the observed QoS from the target QoS is given by:

[...]

Since the weights should always be positive, we assign them as follows:

[...] (49)

[...] bandwidth based on the ratio of its perceived QoS to its target QoS. In this way, a stream suffering from poor QoS automatically takes more of the available excess bandwidth compared with a stream which is meeting or exceeding its target QoS. Closed-loop QoS can make short-term corrections to errors in bandwidth allocation made by the CAC.

QoS Measurement

[0100] In a real implementation, the QoS of a stream must be measured over a time interval. Updates to the weights take place at discrete times. A reasonable scheme is to re-compute the weights directly either before or after the computation of the excess bandwidth. As an example, the average cell delay, $D(n)$, in the first stage queue for a stream can be measured over the time interval $(n\Delta, (n+1)\Delta)$. Average queue length is also a relatively simple weighting function which can be used.

[0101] On the other hand, it is very difficult to estimate cell loss probability over a relatively short period of time. This QoS metric can only be estimated over a longer time interval. A dynamic CAC might be able to estimate cell loss probability and to allocate sufficient bandwidth to correct the cell loss probability for the next measurement interval, based on observations of the traffic over the current interval.

Congestion Control via Dynamic Rate Control

[0102] In this section, we discuss how DRC can be used to control congestion at bottlenecks within a multi-stage switch. We then discuss how DRC can be extended beyond the local switch to provide hop-by-hop congestion control.

Input-Output Buffered Switch

[0103] The prior art weighted fair share schedulers distribute bandwidth weighted with respect to a single bottleneck. Minimum rate guarantees can be provided with respect to this bottleneck. However, if there is a second, downstream bottleneck, the prior art weighted fair share scheduler may not be able to provide bandwidth guarantees. The input streams to the second-stage bottleneck may originate from different bottleneck points in the first stage. If these first stage bottlenecks are scheduled independently by weighted fair share schedulers, congestion at the common second-stage bottleneck may arise, resulting in the loss of rate guarantee.

[0104] Figure 9 shows an example of a hypothetical N x N input-output buffered switch. Input and output modules, IM1 - IMN and OM1 - OMN, respectively, are connected to a core switching element 101 having a central high speed bus 120 (e.g., a time-division multiplexed bus). Each input module has a scheduler which schedules cells to be transmitted over the bus 120. Each output module consists of buffers RT1 - RTN which operate at the speed of the bus, i.e., N times the line speed. When the output buffer occupancy reaches a certain threshold, a signal is broadcast to all input modules. The signal causes all input modules to throttle the flow of traffic to the given output module. This prevents buffer overflow at the output module.

[0105] Consider two streams, S1 and S2, originating from different input modules and destined to the same output module. Assume that both streams are continuously backlogged and suppose they are scheduled using weighted fair share schedulers. Since the schedulers are work conserving, the output cell rate from each input module will be equal to the line rate C. The output module buffer level will eventually exceed the backpressure threshold. The backpressure signal will throttle both input modules until the buffer occupancy at the output module falls below the stop threshold. The throughput received by each stream will be 0.5C. With weighted fair schedulers, it is not possible to achieve different throughputs for the two streams. This is because the schedulers are work conserving with respect to the first stage bottleneck.

[0106] On the other hand, if DRC schedulers are employed at the input modules, the output streams from the input modules can be shaped to different rates. For example, suppose $M_1 = 0.1C$ and $M_2 = 0.8C$. Then the excess bandwidth at the output module bottleneck is 0.1C. If this excess bandwidth is distributed evenly between the two streams, the throughputs for the two streams will be $R_1 = 0.15C$ and $R_2 = 0.85C$, respectively.
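
The arithmetic of this example can be checked with a few lines of Python. Rates are expressed as fractions of the line rate C, and the equal weights reflect the even split assumed in the example; the variable names are illustrative.

```python
C = 1.0                      # line rate, used as the unit
M = [0.1, 0.8]               # minimum guaranteed rates of the two streams
w = [0.5, 0.5]               # even split of the excess bandwidth

E = C - sum(M)                               # excess at the output module
R = [m + wi * E for m, wi in zip(M, w)]      # shaped rates under DRC

# Work-conserving schedulers with backpressure would give each stream 0.5 C.
backpressure_share = [C / 2, C / 2]
```

The shaped rates come out at about 0.15C and 0.85C and together fill the output link, whereas the backpressure case cannot differentiate the two streams.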

Hop-by-Hop Dynamic Rate Control

[0107] Dynamic Rate Control can be extended beyond the local switch if cells are queued per VC. If the downstream switch is able to transmit rate information on a per VC basis, e.g., through resource management (RM) cells, the downstream rate can be used by the local DRC scheduler to schedule VCs as follows:

$$R = \min(R_{\text{local}}, R_{\text{downstream}}).$$

[0108] While the invention has been described with reference to specific embodiments thereof, it will be appreciated by those skilled in the art that numerous variations, modifications, and embodiments are possible, and accordingly, all such variations, modifications, and embodiments are to be regarded as being within the spirit and scope of the invention.

[0109] It should be appreciated that while the above disclosure is provided in terms of scheduling cells in an ATM switch, the inventive dynamic rate control (DRC) can be implemented for scheduling data packets in packet switching. For example, Equations 39 and 40 can be easily modified to account for the packet's length as follows:

$$TS_i = \max\{CT,\ TS_i + L/R_i(n)\}, \tag{39'}$$

$$TS_i = TS_i + L/R_i(n), \tag{40'}$$

wherein L represents the length of the packet at the head of the queue being scheduled.
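
The packet-length variant differs from the cell case only in the spacing term. The Python sketch below shows the two timestamp updates with the packet length in the numerator; the units (bytes and bytes per second) and the function names are assumptions for illustration.

```python
def schedule_packet(TS, CT, L, R):
    """Timestamp when a queue becomes active; spacing scales with packet length."""
    return max(CT, TS + L / R)


def reschedule_packet(TS, L, R):
    """Timestamp after service when the queue is still non-empty."""
    return TS + L / R


# Assumed units: L in bytes, R in bytes per second, timestamps in seconds.
first = schedule_packet(TS=10.0, CT=10.25, L=500.0, R=1000.0)
second = reschedule_packet(TS=first, L=1500.0, R=1000.0)
```

A longer head-of-line packet pushes the next timestamp further out, so the byte rate of the stream, not its packet rate, is what gets shaped.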

[0110] [...] the above-mentioned description has been made about the switch which has input and output buffers, [...] is not restricted to this switch but is also readily applicable to an ATM switch which has a common buffer [...] to be scheduled is logically formed in the common buffer. In addition, the minimum guaranteed rate and the dynamic rate may be selected as the shaping rate.

[0111] [...] by determining each rate of the queues on the basis of [...] minimum guaranteed rate and the excess bandwidth which are fixed and variable, respectively. As a result, the scheduler according to the present invention can respond to services based on the various QoS requirements and can effectively shape the rate on occurrence of a downstream bottleneck.



Claims 
1. A method of rate-based scheduling of a plurality of cells arriving at an ATM switch having a plurality of queues,
comprising:

directing each one of said plurality of cells to a respective queue;
assigning a respective minimum rate guarantee for each of said queues;
assigning a respective excess rate share for each of said queues;
estimating excess bandwidth on a downstream link;

transmitting said cells from said queues according to the respective minimum rate guarantee, while distributing
the excess bandwidth to said queues according to said excess rate share.

2. A method of rate-based scheduling at an ATM switch having a plurality of input queues, comprising the steps of:

assigning a minimum guaranteed rate for each of said queues;
computing a variable rate for each of said queues;

shaping each stream arriving at each of said queues according to the respective minimum guaranteed rate and
the respective variable rate.

3. [...] further comprising the steps of:

monitoring the level of each of said buffers and, when one of said buffers reaches a predetermined level,
generating a shape signal identifying said one buffer;

[...] to be composed of the minimum guaranteed rate plus the variable rate [...]

4. A method of rate-based cell scheduling of a plurality of cell streams, comprising the steps of: 

assigning a minimum rate for each of said plurality of cell streams;
calculating a dynamic rate for each of said cell streams, said dynamic rate comprising a product of an assigned
weight and an estimated excess bandwidth at a downstream bottleneck; and
adding each dynamic rate to a corresponding minimum rate.

5. The method of claim 4, wherein the excess bandwidth is estimated via a feedback control loop.

6. The method of claim 4, wherein said assigned weight is static.

7. The method of claim 4, wherein said assigned weight is dynamic.

8. A method for shaping transmission rate of a cell stream arriving at a buffer, comprising the steps of:

monitoring the level of said buffer;
when the level of said buffer reaches a first predetermined threshold, reducing the transmission rate of said
stream to a preassigned minimum rate; and

when the level of said buffer reaches a second predetermined threshold, halting transmission of said stream.

9. A method for queuing a plurality of virtual channels in an ATM switch having a plurality of input buffers, comprising
the steps of:

assigning an input buffer for each of said virtual channels;
assigning a minimum guaranteed rate for each of said buffers;
assigning a weight for each of said buffers;
calculating a dynamic rate for each of said buffers, said dynamic rate comprising the minimum guaranteed rate
plus a portion of an unused bandwidth of said switch, said portion being proportional to the assigned weight;
and

shaping transmissions from said buffers according to the dynamic rate.

10. The method of claim 9, wherein each of said buffers is assigned to only one virtual channel.

11. The method of claim 9, wherein each of said buffers is assigned to a plurality of virtual channels having similar
quality of service requirements, further comprising the step of:

distributing the dynamic rate of each buffer to its respective active virtual channels.

12. The method of claim 11, wherein said dynamic rate is distributed evenly among the respective active virtual
channels.

13. The method of claim 11, wherein each of said virtual channels is assigned a secondary weight determined for each
class of the service requirements and wherein the dynamic rate is distributed to the respective virtual channels
according to the respective secondary weight.

14. A method for controlling overload in a buffer, comprising:

monitoring a load level in said buffer;

when said load level reaches a first threshold, generating a shape signal to cause input to said buffer to be
reduced to a minimum level;

when said load level reaches a second threshold, generating a stop signal to halt any input to said buffer.

15. The method of claim 14, further comprising the steps of:

estimating an unused bandwidth available on said buffer;
generating a signal indicating said estimate.

16. A method of rate-based scheduling of a plurality of data packets arriving at a switch having a plurality of queues,
comprising:

directing each one of said plurality of packets to a respective queue;
assigning a respective minimum rate guarantee for each of said queues;
assigning a respective excess rate share for each of said queues;
estimating excess bandwidth on a downstream link;

transmitting said packet from said queues according to the respective minimum rate guarantee, while distributing
the excess bandwidth to said queues according to said excess rate share.

17. A method of rate-based scheduling at a switch having a plurality of input queues, comprising the steps of: 

assigning a minimum guaranteed rate for each of said queues; 
computing a variable rate for each of said queues; 

shaping each packet stream arriving at each of said queues according to the respective minimum guaranteed 
rate and the respective variable rate. 

18. A method of scheduling a plurality of queues in a switch which has a plurality of input ports and a plurality of output 
ports and which is operable to switch between the input and the output ports, the method comprising the steps of: 

calculating a first schedule factor predetermined for each of the queues; 

calculating a second factor which is relatively determined in dependency upon a relationship between each
queue and the remaining queues; and 

determining whether scheduling operation of each queue is carried out only by the first schedule factor or by 
both the first and the second schedule factors. 

19. A method as claimed in claim 18, wherein the first schedule factor is specified by a minimum guaranteed rate for
each of said queues while the second schedule factor is specified by a variable share rate which is determined by
the relationship of rates between each queue and the remaining queues.

20. A method as claimed in claim 19, wherein the variable share rate which is given as the second schedule factor
depends on a weight allocated to each queue.

21. A method as claimed in claim 19, wherein the variable share rate which is given as the second schedule factor is
determined by detecting a rate of a common queue which is common to the respective queues, by determining a
minimum guaranteed rate of the queues using the common queue, by adding the rate of the common rate to the
minimum guaranteed rate to obtain a sum rate between the rate of the common rate and the minimum guaranteed
rate, by calculating an unused rate of the common queue on the basis of the sum rate, and by distributing the
unused rate.

22. A method as claimed in claim 21, wherein the unused rate is determined in relation to a weight predetermined by
each of the queues.

23. A method as claimed in claim 21, wherein the switch is specified by an ATM switch while the weight is dynamically
determined on the basis of service classes related to connection admission control carried out in the ATM switch
and current active queues.

24. A method as claimed in claim 23, wherein the unused rate is calculated by the use of a target utilization in the
common queue and a total flow rate in the common queue.

25. A method as claimed in claim 23, wherein the unused rate is calculated by the use of a relationship between a
target queue length of the common queue and a current queue length.

26. A method as claimed in claim 18, wherein the switch is formed by an ATM switch which has at least one of input 
and output buffers; 

the scheduling operation being carried out in relation to the queues included in each buffer. 

27. A method as claimed in claim 18, wherein the switch is formed by an ATM switch which has a common buffer; 

the scheduling operation being carried out in relation to the queues included in the common buffer.

28. A method of rate-based scheduling at an ATM switch having a plurality of input queues, comprising the steps of: 

assigning a minimum guaranteed rate for each of said queues; 
computing a dynamic rate for each of said queues; and 

selectively shaping each stream arriving at each of said queues according to the minimum guaranteed rate and
the dynamic rate.

29. A method as claimed in claim 28, wherein the dynamic rate is determined by the minimum guaranteed rate and a
variable rate computed for each of said queues.

[Figure labels: Controller based on matching target utilization; Controller based on matching target queue length; DRC scheduling with overload control; Rate-shaping scheduler structure for per-class queueing; Rate-shaping scheduler structure for per-VC queueing]